Mapping the correlations and gaps in studies of complex life histories

Abstract For species with complex life histories, phenotypic correlations between life-history stages constrain both ecological and evolutionary trajectories. Studies that seek to understand correlations across the life history differ greatly in their experimental approach: some follow individuals (“individual longitudinal”), while others follow cohorts (“cohort longitudinal”). Cohort longitudinal studies risk confounding results through Simpson's Paradox, where correlations observed at the cohort level do not match those at the individual level. Individual longitudinal studies are laborious in comparison, but provide a more reliable test of correlations across life-history stages. Our understanding of the prevalence, strength, and direction of phenotypic correlations depends on the approaches that we use, but the relative representation of different approaches remains unknown. Using marine invertebrates as a model group, we used a formal, systematic literature map to screen 17,000+ papers studying complex life histories, and characterized the study type (i.e., cohort longitudinal, individual longitudinal, or single stage), as well as other factors. For 3315 experiments from 1716 articles, 67% focused on a single stage, 31% were cohort longitudinal, and just 1.7% used an individual longitudinal approach. While life-history stages have been studied extensively, we suggest that the field prioritize individual longitudinal studies to understand the phenotypic correlations among stages.

Phenotypic correlations among life-history stages also affect evolutionary trajectories. The degree to which phenotypes in different life-history stages are evolutionarily independent remains the subject of intense discussion (e.g., the "adaptive decoupling hypothesis," Bonett & Blair, 2017; Ebenman, 1992; Moran, 1994; Sherratt et al., 2017). When traits among life-history stages are correlated, each stage can constrain the other from evolving to its optimum (Aguirre et al., 2014; Marshall & Morgan, 2011). The broad interest in understanding correlations across life-history stages has generated a wealth of empirical work (Harrison et al., 2011; Moore & Martin, 2021; O'Connor et al., 2014; Pechenik et al., 1998; Pechenik, 2006), but our understanding of correlations among certain life-history stages remains incomplete. An important step in developing the field and identifying priorities is to identify the aspects of complex life histories that have been relatively well studied, and to locate any knowledge gaps that remain.
Currently, it remains unclear whether all life-history stages have been studied equally or if emphases on particular stages exist.
For example, an informal reading of the literature implies that correlations between the larval and juvenile stages are of particular interest in studies of marine invertebrates (Mendt & Gosselin, 2021; Phillips, 2002), insects (Carter & Sheldon, 2020; Moore, 2021; Moore & Martin, 2021), frogs (Green & Bailey, 2015; van Allen et al., 2010), and fish (Araki et al., 2009; Dingeldein & White, 2016). However, by focusing on a few stages, we may be missing key correlations that could regulate populations. For example, the egg stage may determine adult phenotypes and densities, and the effect of this correlation can last for several generations (Downes et al., 2021; Plaistow et al., 2006).
Identifying emphases in the literature would allow future studies to address which stages and correlations are less examined, and provide a more complete understanding of complex life histories and their evolutionary trajectories, such that no one stage remains a "black box."
We also must consider how we study the life cycle, because different experimental designs for studying life histories provide access to different inferences. Broadly, there are three experimental design approaches: (1) single stage, (2) cohort longitudinal, and (3) individual longitudinal (Figure 1). Cohort longitudinal studies are often less finicky than individual longitudinal studies, particularly for very small organisms or those with high mortality rates. If one is interested in quantifying genetic correlations among life-history stages, and a quantitative genetics breeding design is used, the scale of replication is the cohort from a single sire, and therefore cohort longitudinal approaches are appropriate (e.g., Aguirre et al., 2014). But cohort longitudinal studies are susceptible to Simpson's Paradox, whereby trait relationships observed across cohorts may not reflect the trait relationships for individuals; the trend at the individual level could even be opposite to that at the cohort level (Figure 1; note that among-species comparisons are also vulnerable to Simpson's Paradox). Thus, if one wishes to make inferences about phenotypic correlations among life-history stages, then individual longitudinal studies (i.e., following individuals) are most suitable, as they estimate trait (co)variances at the appropriate scale and avoid the potential for Simpson's Paradox (Figure 1b). However, individual longitudinal studies are potentially laborious: rearing individuals can be much harder than rearing a cohort, so we might expect individual longitudinal studies to be rare. Quantifying the relative prevalence of each experimental design approach will help to establish the state of our knowledge, our capacity to draw conclusions about correlations across the life history, and the degree to which we are at risk of Simpson's Paradox (i.e., through reliance on cohort longitudinal studies).
FIGURE 1 Hypothetical data depicting experimental design approaches used to study life histories (single stage studies not shown). (a) Cohort longitudinal studies follow groups (i.e., "cohorts") of individuals across stages. Points represent means ± SE for larval and adult traits of three cohorts. (b) Individual longitudinal studies follow individuals through multiple stages. A small point is an individual, and large points are the means ± SE of each cohort, depicted in panel a. The panels show an example of Simpson's Paradox, where the relationship between two traits across cohorts (a) is the opposite of the relationship observed across individuals within each cohort (b). To make inferences about phenotypic correlations among life-history stages within a species, the individual longitudinal approach is most appropriate.
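The reversal described for Figure 1 can be reproduced with simulated data. The following is a minimal sketch in Python, not part of the original analysis: the cohort means, sample size, slope, and noise levels are all hypothetical values chosen only so that the larval and adult traits are positively related within each cohort while the cohort means are negatively related.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_cohort(larval_mean, adult_mean, n, within_slope, rng):
    """Simulate n individuals whose adult trait increases with their own larval trait."""
    larval = [rng.gauss(larval_mean, 1.0) for _ in range(n)]
    adult = [
        adult_mean + within_slope * (value - larval_mean) + rng.gauss(0.0, 0.3)
        for value in larval
    ]
    return larval, adult


rng = random.Random(1)

# Hypothetical cohort means: the larval trait rises across cohorts
# while the adult trait falls (assumed values, not data from the map).
cohort_means = [(2.0, 12.0), (6.0, 8.0), (10.0, 4.0)]
cohorts = [simulate_cohort(lm, am, 50, 1.0, rng) for lm, am in cohort_means]

# Individual longitudinal view: correlation among individuals within each cohort.
within_r = [pearson(larval, adult) for larval, adult in cohorts]

# Cohort longitudinal view: correlation among the three cohort means.
mean_larval = [statistics.fmean(larval) for larval, _ in cohorts]
mean_adult = [statistics.fmean(adult) for _, adult in cohorts]
among_r = pearson(mean_larval, mean_adult)

print("within-cohort correlations:", [round(r, 2) for r in within_r])
print("among-cohort correlation:", round(among_r, 2))
```

Under these assumptions, every within-cohort correlation is positive while the correlation among cohort means is negative, so a study that recorded only cohort means would infer the wrong sign for the individual-level relationship.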
Another component of experimental design that must be considered is where the study is conducted. The nature, strength, and variability of correlations between life-history stages vary systematically between laboratory-based studies and those done in the field (Monro et al., 2010). Controlled laboratory conditions make experimental manipulations easier and are sometimes the only way to examine certain life-history stages (Diamond, 1983). Nevertheless, field experiments provide information that cannot be gained from laboratory experiments alone (Reznick & Ghalambor, 2005). Estimating the relative frequency of field and laboratory studies should allow us to identify where field studies should be a priority.
Here, we use a systematic map of the empirical literature on marine invertebrates to describe the state of our knowledge regarding phenotypic correlations across life-history stages, and outline the field's strengths and knowledge gaps. Systematic maps use a repeatable methodological framework to quantify what has been studied. Unlike systematic reviews and meta-analyses, systematic maps do not statistically analyze combined data from empirical studies (James et al., 2016; O'Dea et al., 2021). Instead, a systematic map collates, catalogues, and describes studies, outlining the current state of knowledge for a particular topic (James et al., 2016). Marine invertebrates are a good model group for mapping the current knowledge around correlations because of their numerous phyla, diverse life-history modes, and long history of study from the perspective of complex life cycles (MacBride, 1907; Mortensen, 1923; Thorson, 1950). We collected methodological data for studies of life histories across the following six stages: (1) F0 adult, (2) embryo, (3) larva, (4) metamorph, (5) juvenile, and (6) F1 adult (Figure 2a), and recorded the experimental design used for each study.

| Objectives
The objective of this systematic map was to determine areas of focus in life-history studies using marine invertebrates as a model system.
The primary questions were as follows:
• What life-history stages are most commonly studied?
• What experimental design approaches (i.e., single stage, cohort longitudinal, and individual longitudinal) are used?
• Are studies conducted in the laboratory or the field?

| Review team, inclusion criteria, and search strategy
We followed the six-stage Systematic Mapping Methodology (James et al., 2016). The review team consisted of four primary reviewers and one stakeholder. The stakeholder commissioned and shaped the scope of the systematic map; established search methods and [...]
We first established inclusion and exclusion criteria: life-history studies had to be empirical, use marine invertebrates, and measure at least one fitness-related trait (e.g., fecundity, survival, size, growth, development time). For studies using more than one life-history stage, the traits measured in each stage need not be the same (Tables A1 and A2). We excluded theoretical, observational, and qualitative studies without empirical data, and empirical studies that did not measure a fitness trait (e.g., studies measuring only behavior or metabolism).
To establish a search protocol, we first scoped Web of Science using the simple search function to identify search terms that would yield ~10% of articles to be accepted for the map (for search terms that were tested but not used, see Table A3). Our protocol was to use the chosen search terms to screen articles at the title and abstract level, and then import the relevant articles into EndNote (version X8.2).
We then assessed the relevant full-text articles for eligibility. Once articles were approved for the map, data were extracted and added to a database in Excel 2016.
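As a rough check on the ~10% acceptance target, the yield at each screening step can be computed from the article counts reported in the caption of Figure A1. This is a small illustrative sketch rather than part of the original protocol; the variable and function names are hypothetical.

```python
# Article counts as reported in the caption of Figure A1.
screened = 17_447    # screened at the title and abstract level
full_text = 6_552    # full-text articles checked for eligibility
included = 1_716     # articles approved for the final map


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


full_text_rate = pct(full_text, screened)   # share of screened articles read in full
acceptance_rate = pct(included, screened)   # share of screened articles accepted
inclusion_rate = pct(included, full_text)   # share of full-text articles accepted

print(f"read in full: {full_text_rate}% of screened articles")
print(f"accepted:     {acceptance_rate}% of screened articles")
print(f"accepted:     {inclusion_rate}% of full-text articles")
```

On these counts, just under 10% of the screened articles were accepted for the map, which is close to the yield targeted during scoping.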

| Searching, identification, and screening
Because we conducted four searches over 7 years, we had to slightly adjust our protocol each time to ensure that we were getting an unbiased sample of the literature, while still using methods that were logistically attainable (i.e., search terms that yielded a reasonable number of articles to be assessed). We used the four search strings provided in Table A4. All databases in Web of Science were used for searches 1-3, but only the Web of Science Core Collection was used for search 4 because searching all databases yielded too many articles for screening (>34,000 hits). In searches 1 and 2, we restricted the search to the journals Biological Bulletin and Marine Biology, respectively, because these journals have historically published studies on marine invertebrate life histories.
For the other searches, all journals in the Web of Science were included. All hits were screened at the title and abstract level, except for search 2, for which we assessed only the first 1000 articles because relevant articles were scarce thereafter.

| Coding and production of the map database
For the 1716 articles included, we recorded information on (1) species; (2) studies; and (3) references, and describe them below. Some articles had multiple studies, and thus have multiple rows in the database. Records were coded as having multiple studies if (1) more than one stage was investigated, but the stages were measured at different times and/or in different locations (e.g., two single stage studies for larvae and juveniles), or (2) if the same stage(s) were investigated in separate studies (e.g., one larval study that manipulates temperature and the other salinity). In total, we had data for 3315 studies.

| Species
We included phylum, class, and species for each study. We also recorded developmental mode: planktotrophic (i.e., planktonic, feeding larvae), lecithotrophic (i.e., planktonic, non-feeding larvae), or direct development (i.e., aplanktonic, crawl-away juveniles). We used the package "taxize" (Chamberlain & Szocs, 2013) in R (v. 4.1.2) to search the Global Biodiversity Information Facility (GBIF) database and identify species names in the dataset that were synonyms, so that each species is referred to by just one name.

| Studies
For each study, we recorded the fitness traits measured for the life-history stages: F0 adult, embryo, larva, metamorph, juvenile, and F1 adult (Table A2; Figure 2a). We classified each study into one of three experimental designs: single stage, cohort longitudinal (i.e., multiple stages following cohorts), and individual longitudinal (i.e., multiple stages following individuals). We also recorded whether the study was conducted in the field or laboratory. Studies conducted partially in the laboratory and partially in the field were classified overall as field studies because quantifying phenotypes almost inevitably required some laboratory work.
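The coding rules in this section can be written out as a small decision procedure. The sketch below is a hypothetical Python illustration of those rules, not the procedure used to build the database (which was coded by reviewers in Excel); the `Study` record and both function names are assumptions introduced here.

```python
from dataclasses import dataclass

# The six life-history stages recorded in the map.
STAGES = ("F0 adult", "embryo", "larva", "metamorph", "juvenile", "F1 adult")


@dataclass(frozen=True)
class Study:
    """Hypothetical record for one study (one row of the map database)."""
    stages: tuple              # stages in which a fitness trait was measured
    follows_individuals: bool  # True if the same individuals are tracked across stages
    settings: frozenset        # any of {"laboratory", "field"}


def classify_design(study):
    """Return the experimental design category for a study."""
    measured = set(study.stages)
    if not measured or not measured <= set(STAGES):
        raise ValueError(f"unexpected stages: {study.stages!r}")
    if len(measured) == 1:
        return "single stage"
    if study.follows_individuals:
        return "individual longitudinal"
    return "cohort longitudinal"


def classify_setting(study):
    """Studies run partly in the field are coded as field studies."""
    return "field" if "field" in study.settings else "laboratory"


# Illustrative study: larvae settled in the laboratory, then the same
# individuals followed as juveniles in the field.
example = Study(
    stages=("larva", "juvenile"),
    follows_individuals=True,
    settings=frozenset({"laboratory", "field"}),
)
print(classify_design(example), "/", classify_setting(example))
```

Treating any field component as sufficient for a "field" classification mirrors the rule stated above that mixed laboratory and field studies are coded as field studies.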

| References
The full citation for each study is included.
After all data were extracted, reviewer 4 screened the entire map database to check for consistency and clarity across reviewers, meaning searches from 2014 and 2015 were double-screened and are consistent with the 2021 search.

| RESULTS
Of the 3315 studies in the dataset, 30.9% followed cohorts and only 1.7% followed individuals through multiple stages of the life cycle (Figure 2b). Studies were most commonly conducted in the laboratory (88.2%) and focused on a single stage (67.4%; Figure 2b).
Studies beginning with the metamorph or juvenile stages were rare across all experimental designs. Generally, studies most often began by measuring F0 adults, embryos, and larvae (Figure 3).
When we quantified which stages the longitudinal approaches measured most often, we found that studies following cohorts mostly measured the F0 adult, embryo, and larval stages (Figure 4a).
Because species with direct development do not have free-swimming larvae and do not metamorphose, we analyzed those data in isolation. There were 260 studies that used species with direct development; most followed a cohort (46.9%) or focused on a single stage (51.9%) (Figures A1 and A2). Broadly, most studies began by measuring the F0 adult stage (Figure A3), and studies measuring sequential stages usually ended at the juvenile stage, meaning measurements of F1 adults were rare (Figure A4). However, there was one individual longitudinal study that measured all four stages (Figure A4c).
We also compared the relative frequency of the three development modes in our dataset to their frequency reported in the compilation from Marshall et al. (2012). Studies of planktotrophic species were overrepresented in our literature map: they were used in 62.3% of studies. Compared to Marshall et al. (2012), lecithotrophic species were underrepresented in articles by 23%, and direct developing species by 39.5%.
The most common phyla studied were Mollusca (34.9%), Echinodermata (21.8%), and Arthropoda (13.1%), accounting for [...]
Following individuals through the life cycle was rare (1.7% of studies), and no study measured the entire life history. The field has focused on describing single stages of life histories, and it is tempting to infer correlations between two stages even when they are studied as separate cohorts. But comparing two cohorts in this way is inappropriate, which highlights that most studies cannot make inferences about phenotypic correlations between stages. Studies that measured more than one stage almost exclusively followed cohorts, and could potentially [...]
Importantly, we know other studies likely exist that meet some of our criteria (and some are even well-known studies), but were not captured by our systematic map. This highlights an important limitation of systematic maps that must be acknowledged: no map will be perfectly comprehensive, so missing some important studies is unavoidable. Expanding search terms further generated an impractical number of papers to process, as the 17,000+ papers that we did screen required hundreds of person-hours. Nevertheless, our map can be considered a representative and unbiased sample of the literature, such that the relative abundance of different study types is unlikely to change if broader terms were used. Suffice it to say, regardless of [...] (Aguirre et al., 2014; Cameron et al., 2017, 2021; Marshall, 2021).
Thus, we suggest the field move toward the individual longitudinal approach (Cameron et al., 2019; Clutton-Brock & Sheldon, 2010; Hoffmann & Sgró, 2011; Marshall, 2021; Schuster et al., 2021). While we encourage using the individual longitudinal method, we acknowledge why following individuals through the life history is rare: long-term studies are costly, logistically difficult to maintain and, therefore, risky to undertake, particularly for long-lived species (Clutton-Brock & Sheldon, 2010). We found only one case in which all stages of the life cycle were measured: an individual longitudinal study that used a species with only four stages (i.e., direct development). Further, we found that cohort longitudinal studies measured the F0 adult, embryonic, and larval stages most often (Figure 4a,b). However, we found the opposite trend in individual longitudinal studies, which measured metamorphs, juveniles, and F1 adults most often (Figure 4c,d). Why do studies following individuals mostly measure stages after metamorphosis? Of the studies that used an individual longitudinal approach, 71% used species with lecithotrophic larvae, which have relatively short development times. Culturing individuals with long larval stages is less straightforward. For example, it is much more difficult to study the whole life history of the planktotrophic sea star Pisaster ochraceus, which matures in 5 years (Menge, 1975), than that of the lecithotrophic marine bryozoan Bugula neritina, which takes ~7 weeks to mature (Keough, 1989). An interesting next step would be to explore whether there is also a dearth of individual longitudinal studies in terrestrial taxa (e.g., insects, frogs). Terrestrial groups have analogous life histories to marine invertebrates (e.g., indirect vs. direct development), so we expect that those systems suffer from similar methodological limitations and biases, but this awaits testing.
FIGURE 4 Summary of life-history stages measured in (a, b) cohort longitudinal studies (i.e., multiple stages measured, following groups of individuals) and (c, d) individual longitudinal studies (i.e., multiple stages measured, following individuals). Thickness of bars represents the number of studies that measured each stage combination (see legends). (a, c) Studies that measure multiple stages sequentially. Concentric circles show studies that start at each stage; bars are thick at the start of each circle, and become narrow as fewer studies measure later stages. (b, d) Studies that measure multiple stages, but not sequentially. Lines show when studies skip stages (e.g., d; thick line represents studies that measured larvae and juveniles, but not metamorphs).
Because of the challenge in executing individual longitudinal studies, phenotypic correlations between immature stages and F1 adults remain obscure: only 12 individual longitudinal studies in our map measured both juveniles and F1 adults, and none was done on species with long development times (i.e., planktotrophs). While planktotrophic species may be more difficult to study than lecithotrophs, our dataset contains excellent candidates for individual longitudinal studies. For example, planktotrophic crustacean larvae, particularly decapods, are large and robust, and have been cultured individually to the juvenile and/or adult stages in the laboratory (Pansch et al., 2018; van Alstyne et al., 2014) and in the field (Lathlean & Minchinton, 2012). Our map suggests that rather than selecting species conducive to studying correlations between stages, we have prioritized studying a few model species that have small, less-resilient larvae and are not ideal for individual longitudinal studies (e.g., echinoderms; Figure 5). While model species, such as Strongylocentrotus purpuratus (echinoderm) or Mytilus edulis (mollusk), are important, we recommend that, going forward, biologists select study species that can be tracked individually for the entire life cycle to gain a more holistic view of correlations between stages and how they affect fitness.
Perhaps unsurprisingly, only 12% of studies in the dataset were conducted under field conditions, and of the field studies, less than  (Reznick & Ghalambor, 2005).
Similarly, in a marine invertebrate, a laboratory experiment found selection for mothers to produce small offspring, but in the field, there was selection for small offspring in some cohorts and large offspring in others (Monro et al., 2010). For marine invertebrates, studying certain life stages in the field may always be difficult, or may be restricted to certain species. For example, sessile species provide an opportunity to follow larvae to the juvenile and/or adult phase, because larvae can be settled on plates and then deployed in the field (Emlet & Sadro, 2006; Graham et al., 2013; Hettinger et al., 2013; Marshall et al., 2003; Phillips, 2002). A more difficult task is following mobile species. There are a few examples of studies that followed free-swimming embryos and larvae in the field (Young, 1986) or kept them in screened cages (Basch & Pearse, 1996; Nedelec et al., 2014), and a study that reared larvae in the laboratory and transplanted mobile juveniles to the field by using protective mesh (Li & Chiu, 2013).
We encourage future studies to adapt and improve the methodologies we have discussed so that more species can be studied under natural conditions and we can understand the degree to which stages are linked.
Our systematic map shows that, for over 100 years, the field has done an excellent job in describing individual stages of marine invertebrate life histories, but this means that we have sacrificed understanding correlations between stages. Phenotypic correlations between stages have important implications for population dynamics (Burgess & Marshall, 2011; Taylor & Scott, 1997) and for how traits in different stages may evolve (Marshall & Morgan, 2011).
While studies following cohorts are an important first step for answering questions about development, they risk Simpson's Paradox.
The best approach for avoiding Simpson's Paradox is individual longitudinal studies, but these remain exceedingly uncommon, accounting for just 1.7% of studies in our map. While reviews in terrestrial systems have suggested that we move toward an individual longitudinal approach when studying life histories (Clutton-Brock & Sheldon, 2010; Nussey et al., 2008), we are among the first to systematically quantify the frequency of experimental approaches. We expected to find that individual longitudinal studies are rare, but we note that an important part of science is to confirm the gravity of the problem. We acknowledge and celebrate the tremendous progress in the field of complex life cycles, but we hope that the issues we have identified here encourage and incentivize future studies to use an individual longitudinal approach to understand the ecological and evolutionary significance of phenotypic correlations across life-history stages.

ACKNOWLEDGMENTS
We thank Annie Guillaume, Henry Wootton, and Victor Shelamoff for helping to collect data as primary reviewers. We also thank Liz Morris for edits of the manuscript.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
Data and code are archived and available on Dryad: https://doi.

TABLE A3 Search strings that were trialed during the scoping stage following the six-stage Systematic Mapping Methodology from James et al. (2016) for the literature map of marine invertebrate life histories.

FIGURE A1 Flow diagram showing the selection process for articles included in the systematic map of marine invertebrate life histories. The searching, screening, eligibility, and inclusion processes were adapted from the Systematic Mapping Methodology (James et al., 2016). At the top, the number of articles identified from each of the four searches is provided; details of the four searches, including search terms used, are provided in Table A4. After 131 duplicates were removed, 17,447 were screened at the title and abstract level, 6552 full-text articles were checked for eligibility, and ultimately 1716 articles were approved for the final literature map.

FIGURE A2 (a) Stages included in the literature map of marine invertebrate life histories for species with direct development. For each study, the life-history stages measured were recorded: (1) F0 adult; (2) embryo; (3) juvenile; and (4) F1 adult. (b) Frequency of each experimental design type identified in the literature map (n = 260 studies total). Study type is depicted in the life cycle above each column: single stage studies measure one stage in the life history; cohort longitudinal studies follow groups (i.e., "cohorts") of individuals across stages; and individual longitudinal studies follow individuals through multiple stages.

FIGURE A3 Frequency of studies that begin with each life-history stage, across the three experimental design approaches: (a) single stage; (b) cohort longitudinal (i.e., multiple stages following groups of individuals); (c) individual longitudinal (i.e., multiple stages following individuals).

FIGURE A4 Summary of life-history stages measured in (a, b) cohort longitudinal studies (i.e., multiple stages measured, following groups of individuals) and (c, d) individual longitudinal studies (i.e., multiple stages measured, following individuals). Thickness of bars represents the number of studies that measured each stage combination (see legends). (a, c) Concentric circles show studies that start at each stage; bars are thick at the start of each circle, and become narrow as fewer studies measure subsequent stages. (b) Lines show when studies skip stages (e.g., thick line represents studies that measured F0 adults and juveniles, but not embryos). There were no individual longitudinal studies that skipped stages.